Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support routing to Kubernetes clusters by request path #50567

Merged
merged 14 commits into from
Jan 27, 2025

Conversation

timothyb89
Copy link
Contributor

@timothyb89 timothyb89 commented Dec 24, 2024

This implements path-based routing for Kubernetes clusters as described by RFD0185. A new prefixed path handler is added that accepts base64-encoded Teleport and Kubernetes cluster names. The request is routed to the destination Teleport cluster using these parameters instead of those embedded in the session TLS identity, and then the preexisting handlers check authorization and complete the request as usual.

This removes the need for certificates to be issued per Kubernetes cluster: so long as the incoming identity is granted access to the cluster via its roles, access can succeed, and no KubernetesCluster attribute or cert usage restrictions are needed.

Fixes #40405

No changelog for this PR, will be added in #50898


Testing steps

  1. Ensure a Teleport proxy is running this branch with one or more k8s clusters attached. The Kubernetes node does not need to be running this branch, only the proxy.

  2. Build tbot against Machine ID: Support path-based Kubernetes routing #50898

  3. Create a new bot with appropriate permissions to access your kube node(s), e.g. tctl bots add foo --roles=kube-access

  4. Start tbot with the new kubernetes/v2 service:

    $ tbot start kubernetes/v2 --destination=./tbot-user --storage=./tbot-data --token=$token --join-method=token --kubernetes-cluster-name default --proxy-server=example.teleport.sh:443
    

    The --kubernetes-cluster-name flag can be repeated. Alternatively (or additionally), try a label matcher:

    $ tbot start kubernetes/v2 --destination=./tbot-user --storage=./tbot-data --token=$token --join-method=token --kubernetes-cluster-labels foo=bar --proxy-server=example.teleport.sh:443
    
  5. Examine ./tbot-user/kubeconfig.yaml. You should see one cluster: and one context: entry per matched cluster. The server: field should include a new URL suffix, /v1/teleport/<teleportCluster>/<kubeCluster>

  6. Try KUBECONFIG=./tbot-user/kubeconfig.yaml kubectl get pods. Also consider testing kubectl exec, port forward, etc.

This implements path-based routing for Kubernetes clusters as
described by [RFD0185]. A new prefixed path handler is added that
accepts base64-encoded Teleport and Kubernetes cluster names. The
request is routed to the destination Teleport cluster using these
parameters instead of those embedded in the session TLS identity, and
then the preexisting handlers check authorization and complete the
request as usual.

This removes the need for certificates to be issued per Kubernetes
cluster: so long as the incoming identity is granted access to the
cluster via its roles, access can succeed, and no `KubernetesCluster`
attribute or cert usage restrictions are needed.

[RFD0185]: #47436
@timothyb89
Copy link
Contributor Author

timothyb89 commented Dec 24, 2024

Outstanding TODOs:

  • Add manual testing instructions to PR description
  • Unit tests
  • More rigorously validate security impact. Particularly want to ensure per-session MFA guarantees remain intact.
  • PR for tbot changes to test this more easily. Should be a low-impact change, and my current hacked tsh build is not really an acceptable validation environment. PR is available here: Machine ID: Support path-based Kubernetes routing #50898

lib/kube/proxy/forwarder.go Outdated Show resolved Hide resolved
lib/kube/proxy/forwarder.go Outdated Show resolved Hide resolved
lib/kube/proxy/forwarder.go Outdated Show resolved Hide resolved
return "", "", trace.Wrap(err)
}

// TODO: do we care to otherwise validate these results before casting
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I think it would be better anw

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not totally sure what other validation steps we might want here - somehow I'd never looked into it before but we seem to impose very few restrictions on resource names. I've added a check to make sure the parameters decode to valid UTF8, at least. Downstream looks to be a map lookup from cached cluster details, so I don't expect significant risk.

assert: func(t *testing.T, restConfig *rest.Config) {
client := pathRoutedKubeClient(t, restConfig, clusterName, "a")
_, err = client.CoreV1().Pods(metav1.NamespaceDefault).List(context.Background(), metav1.ListOptions{})
require.ErrorContains(t, err, "cannot list resource")
Copy link
Contributor Author

@timothyb89 timothyb89 Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed a mild behavior change here, I think since this just wasn't possible before. Attempting to e.g. list resources on a cluster denied by roles now yields (kubectl get pods -v8):

I0114 20:55:58.643198   22226 request.go:1212] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"kubernetes cluster \"default\" not found","reason":"Forbidden","code":403}
I0114 20:55:58.643592   22226 helpers.go:246] server response object: [{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "kubernetes cluster \"default\" not found",
  "reason": "Forbidden",
  "code": 403
}]
Error from server (Forbidden): kubernetes cluster "default" not found

...however, attempting to list a nonexistent cluster returns:

I0114 20:51:57.008342   10127 request.go:1212] Response Body: I0114 20:51:57.008342   10127 request.go:1212] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Kubernetes cluster \"asdf\" not found","reason":"NotFound","code":404}
E0114 20:51:57.008352   10127 memcache.go:265] couldn't get current server API group list: the server could not find the requested resource
I0114 20:51:57.008357   10127 cached_discovery.go:120] skipped caching discovery info due to the server could not find the requested resource
I0114 20:51:57.008477   10127 helpers.go:246] server response object: [{
  "metadata": {},
  "status": "Failure",
  "message": "the server could not find the requested resource",
  "reason": "NotFound",
  "details": {
    "causes": [
      {
        "reason": "UnexpectedServerResponse",
        "message": "unknown"
      }
    ]
  },
  "code": 404
}]
Error from server (NotFound): the server could not find the requested resource

(and a Kubernetes cluster "$name" not found" is logged on the server)

Previously, you could only ever get certs for a cluster if the name was valid and the user is authorized to see it, and returns a consistent error otherwise before the cert is ever issued. Would we consider this a resource name oracle?

It seems like the two responses probably ought to be identical but changing them might be a mild breaking change to existing behavior (403 vs 404).

Also, don't allow path parameters to override the identity, if the
contains nonempty routing params.
@timothyb89 timothyb89 added the no-changelog Indicates that a PR does not require a changelog entry label Jan 17, 2025
Copy link
Contributor

@strideynet strideynet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me - although I can't claim any expertise in regards to Kubernetes Access. Let's make sure @tigrato reviews this before we merge anything.

lib/kube/proxy/forwarder.go Outdated Show resolved Hide resolved
return nil, trace.Wrap(err)
}
state := authCtx.GetAccessState(authPref)
if state.MFARequired != services.MFARequiredNever && (teleportClusterName == "" || kubeCluster == "") {
Copy link
Contributor

@tigrato tigrato Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code is checked after singleCertHandler (and not before) rewrites the request so hopefully it will never be empty so this check doesn't seem to change anything nor preventing users from using MFA to any cluster IIUC

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was attempting to keep some invariants in place, but you're right, it didn't help. I've replaced it with a check for a non-empty MFAVerified cert field in ensureRouteNotOverwritten().

The thinking when session MFA is required is:

  • If an MFA assertion is missing, routing info can be freely overridden and we rely on CheckAccess() to deny access as usual
  • If an MFA assertion is present and path parameters would override identity fields, deny access
  • If an MFA assertion is present and path parameters match, allow access

Without this, if you were somehow able to convince auth to issue an MFA cert with empty routing parameters, you could access any cluster via path routing.

(This is assuming I've understood MFAVerified properly, at least, but I'm reasonably certain of that. If not we'll need to do some potentially messy GetAccessState() checks.)

path = "/" + path
}

req.URL.Path = path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You also need to change RawPath

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now explicitly set req.URL.RawPath = "". The empty string should be enough to satisfy (or rather, not satisfy) httprouter's check: https://github.com/gravitational/httprouter/blob/teleport/router.go#L393

}

ctx := authz.ContextWithUser(req.Context(), userType)
req = req.WithContext(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
req = req.WithContext(ctx)
req = req.Clone(ctx)

deep clone the r.URL because it's a pointer and it's not copied during req.WithContext otherwise we are introducing a data race.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch, it's now deep copied as you suggested.

Deep clone the request and explicitly clear RawPath.
@timothyb89 timothyb89 added this pull request to the merge queue Jan 27, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 27, 2025
@timothyb89 timothyb89 added this pull request to the merge queue Jan 27, 2025
Merged via the queue into master with commit 58d25b5 Jan 27, 2025
42 checks passed
@timothyb89 timothyb89 deleted the timothyb89/kubernetes-path-routing branch January 27, 2025 21:45
@public-teleport-github-review-bot

@timothyb89 See the table below for backport results.

Branch Result
branch/v17 Failed

timothyb89 added a commit that referenced this pull request Jan 28, 2025
* Support routing to Kubernetes clusters by request path

This implements path-based routing for Kubernetes clusters as
described by [RFD0185]. A new prefixed path handler is added that
accepts base64-encoded Teleport and Kubernetes cluster names. The
request is routed to the destination Teleport cluster using these
parameters instead of those embedded in the session TLS identity, and
then the preexisting handlers check authorization and complete the
request as usual.

This removes the need for certificates to be issued per Kubernetes
cluster: so long as the incoming identity is granted access to the
cluster via its roles, access can succeed, and no `KubernetesCluster`
attribute or cert usage restrictions are needed.

[RFD0185]: #47436

* Remove routeSourcer in favor of identity values

Also: fixes various other review comments, adds comment explaining
auth strategy.

* Revert more dead code

* Revert more unnecessary changes; validate utf8 parameters

* Build fixes after merge with slog transition

* Add basic tests for Kubernetes path routing

* Fix imports

* Check identity parameters when per session MFA is enabled

Also, don't allow path parameters to override the identity, if the
contains nonempty routing params.

* Fix failing TestSingleCertRouting due to improper reuse of rest config

* Session MFA includes role based enforcement; added more test cases

* Don't allow route params to be overwritten if MFAVerified is set

* Code review feedback

Deep clone the request and explicitly clear RawPath.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/branch/v17 kubernetes-access no-changelog Indicates that a PR does not require a changelog entry size/md
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support connecting to multiple Kubernetes clusters using a single X509 cert
3 participants